Overview

Dataset statistics

Number of variables: 14
Number of observations: 786600
Missing cells: 24767
Missing cells (%): 0.2%
Duplicate rows: 546
Duplicate rows (%): 0.1%
Total size in memory: 90.0 MiB
Average record size in memory: 120.0 B
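
The percentage and per-record figures above can be re-derived from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over all cells (variables times observations) and that 1 MiB is 1024 * 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 14
n_observations = 786600
missing_cells = 24767
duplicate_rows = 546
total_size_mib = 90.0

# Assumption: the missing-cell share is relative to every cell in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Duplicate rows as a share of all observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values round to the figures reported above.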

Variable types

NUM: 10
CAT: 2
BOOL: 2

Warnings

Dataset has 546 (0.1%) duplicate rows [Duplicates]
customer_id has a high cardinality: 245455 distinct values [High cardinality]
order_date has a high cardinality: 776 distinct values [High cardinality]
customer_order_rank has 24767 (3.1%) missing values [Missing]
voucher_amount is highly skewed (γ1 = 30.39394065) [Skewed]
platform_id is highly skewed (γ1 = -22.53663783) [Skewed]
voucher_amount has 743462 (94.5%) zeros [Zeros]
delivery_fee has 597536 (76.0%) zeros [Zeros]
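
The Skewed and Zeros warnings above rest on two simple quantities: the skewness coefficient γ1 and the share of zero values. The sketch below computes both on a toy column. It is an illustration under stated assumptions, not the report's own code: it uses the uncorrected moment formula m3 / m2^1.5 (the report may apply a bias correction), and `SKEW_THRESHOLD = 20` is a hypothetical cut-off chosen only because it is consistent with |γ1| values of 22.5 and 30.4 being flagged while 15.6 (amount_paid) is not.

```python
def skewness(values):
    """Moment-based skewness m3 / m2**1.5, without bias correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Hypothetical threshold, see the assumptions stated above.
SKEW_THRESHOLD = 20

# Toy data: almost every order has no voucher, one order has a large one.
toy_vouchers = [0.0] * 999 + [50.0]

g1 = skewness(toy_vouchers)
print(f"skewness = {g1:.2f}, flagged = {abs(g1) > SKEW_THRESHOLD}")
print(f"zeros = {zero_share(toy_vouchers):.1%}")
```

A column dominated by zeros with a few large values, as voucher_amount is, produces a large positive γ1 in this way.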

Reproduction

Analysis started: 2020-10-13 20:37:34.905650
Analysis finished: 2020-10-13 20:43:18.293751
Duration: 5 minutes and 43.39 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

customer_id
Categorical

HIGH CARDINALITY

Distinct: 245455
Distinct (%): 31.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Most frequent values

Value                   Count    Frequency (%)
15edce943edd            386      < 0.1%
8745a335e9cf            288      < 0.1%
d956116d863d            286      < 0.1%
0063666607bb            273      < 0.1%
ae60dce05485            270      < 0.1%
a54a8e1579d4            254      < 0.1%
bebb751d49b8            253      < 0.1%
26ed6389a3aa            245      < 0.1%
ef6265f74aca            229      < 0.1%
a333fb175a0c            221      < 0.1%
Other values (245445)   783895   99.7%

[Figure: Frequencies of value counts]

Unique

Unique: 145498
Unique (%): 18.5%

[Figure: Histogram of lengths of the category]

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

order_date
Categorical

HIGH CARDINALITY

Distinct: 776
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Most frequent values

Value                 Count    Frequency (%)
2017-01-01            4230     0.5%
2016-12-18            3395     0.4%
2017-02-26            3234     0.4%
2017-02-05            3218     0.4%
2017-02-12            3125     0.4%
2016-12-11            3100     0.4%
2016-12-04            3075     0.4%
2017-01-22            3005     0.4%
2017-01-29            3003     0.4%
2016-10-03            2999     0.4%
Other values (766)    754216   95.9%

[Figure: Frequencies of value counts]

Unique

Unique: 41
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

order_hour
Real number (ℝ≥0)

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.58879608
Minimum: 0
Maximum: 23
Zeros: 4627
Zeros (%): 0.6%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 12
Q1: 16
median: 18
Q3: 20
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.357192477
Coefficient of variation (CV): 0.1908710785
Kurtosis: 5.749711941
Mean: 17.58879608
Median Absolute Deviation (MAD): 2
Skewness: -1.749088644
Sum: 13835347
Variance: 11.27074133
Monotonicity: Not monotonic
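
Several of the order_hour figures above are derived from others in the same block. As a rough cross-check, and assuming the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean times number of observations), they can be recomputed as follows; the variable names are illustrative only.

```python
import math

# Figures copied from the order_hour statistics above.
n_observations = 786600
mean = 17.58879608
std = 3.357192477
minimum, q1, q3, maximum = 0, 16, 20, 23

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle half of the data
cv = std / mean                   # coefficient of variation
variance = std ** 2
total = mean * n_observations     # sum of all values

print(value_range, iqr)
print(f"CV = {cv:.10f}")
print(f"Variance = {variance:.8f}")
print(f"Sum = {total:.0f}")

# Sanity check against the reported coefficient of variation.
assert math.isclose(cv, 0.1908710785, rel_tol=1e-6)
```

The same relations can be applied to the other numeric variables in this report.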
[Figure: Histogram with fixed size bins (bins=24)]

Most frequent values

Value                 Count    Frequency (%)
19                    134030   17.0%
18                    129654   16.5%
20                    108739   13.8%
17                    90782    11.5%
21                    68223    8.7%
16                    48877    6.2%
15                    34286    4.4%
22                    33403    4.2%
13                    31105    4.0%
14                    30323    3.9%
Other values (14)     77178    9.8%

Smallest values

Value                 Count    Frequency (%)
0                     4627     0.6%
1                     2425     0.3%
2                     1187     0.2%
3                     443      0.1%
4                     137      < 0.1%

Largest values

Value                 Count    Frequency (%)
23                    13832    1.8%
22                    33403    4.2%
21                    68223    8.7%
20                    108739   13.8%
19                    134030   17.0%

customer_order_rank
Real number (ℝ≥0)

MISSING

Distinct: 369
Distinct (%): < 0.1%
Missing: 24767
Missing (%): 3.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.436809642
Minimum: 1
Maximum: 369
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 3
Q3: 10
95-th percentile: 39
Maximum: 369
Range: 368
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 17.77232218
Coefficient of variation (CV): 1.88329773
Kurtosis: 49.04720204
Mean: 9.436809642
Median Absolute Deviation (MAD): 2
Skewness: 5.494014541
Sum: 7189273
Variance: 315.8554356
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value                 Count    Frequency (%)
1                     244937   31.1%
2                     96641    12.3%
3                     60532    7.7%
4                     43681    5.6%
5                     34036    4.3%
6                     27603    3.5%
7                     23049    2.9%
8                     19696    2.5%
9                     17013    2.2%
10                    14889    1.9%
Other values (359)    179756   22.9%
(Missing)             24767    3.1%

Smallest values

Value                 Count    Frequency (%)
1                     244937   31.1%
2                     96641    12.3%
3                     60532    7.7%
4                     43681    5.6%
5                     34036    4.3%

Largest values

Value                 Count    Frequency (%)
369                   1        < 0.1%
368                   1        < 0.1%
367                   1        < 0.1%
366                   1        < 0.1%
365                   1        < 0.1%

is_failed
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Value                 Count    Frequency (%)
0                     761833   96.9%
1                     24767    3.1%

voucher_amount
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 911
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.09148909292
Minimum: 0
Maximum: 93.3989
Zeros: 743462
Zeros (%): 94.5%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0.686
Maximum: 93.3989
Range: 93.3989
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4795579176
Coefficient of variation (CV): 5.241694963
Kurtosis: 3886.352852
Mean: 0.09148909292
Median Absolute Deviation (MAD): 0
Skewness: 30.39394065
Sum: 71965.32049
Variance: 0.2299757963
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value                 Count    Frequency (%)
0                     743462   94.5%
1.029                 11647    1.5%
1.715                 11134    1.4%
2.058                 9122     1.2%
0.686                 3648     0.5%
1.372                 1770     0.2%
2.744                 1192     0.2%
2.5725                897      0.1%
3.43                  543      0.1%
0.5145                373      < 0.1%
Other values (901)    2812     0.4%

Smallest values

Value                 Count    Frequency (%)
0                     743462   94.5%
0.00343               35       < 0.1%
0.28469               1        < 0.1%
0.32242               1        < 0.1%
0.343                 19       < 0.1%

Largest values

Value                 Count    Frequency (%)
93.3989               1        < 0.1%
78.02907              1        < 0.1%
68.3942               1        < 0.1%
61.82575              1        < 0.1%
37.57565              1        < 0.1%

delivery_fee
Real number (ℝ≥0)

ZEROS

Distinct: 98
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1811799318
Minimum: 0
Maximum: 9.86
Zeros: 597536
Zeros (%): 76.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0.986
Maximum: 9.86
Range: 9.86
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3697095668
Coefficient of variation (CV): 2.040565769
Kurtosis: 8.481347092
Mean: 0.1811799318
Median Absolute Deviation (MAD): 0
Skewness: 2.417459196
Sum: 142516.1343
Variance: 0.1366851638
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value                 Count    Frequency (%)
0                     597536   76.0%
0.493                 70617    9.0%
0.986                 35735    4.5%
0.7395                34790    4.4%
0.2465                7664     1.0%
1.2325                7164     0.9%
1.479                 6768     0.9%
1.4297                5078     0.6%
0.46835               3097     0.4%
0.4437                2657     0.3%
Other values (88)     15494    2.0%

Smallest values

Value                 Count    Frequency (%)
0                     597536   76.0%
0.02465               10       < 0.1%
0.0493                3        < 0.1%
0.0986                4        < 0.1%
0.1479                303      < 0.1%

Largest values

Value                 Count    Frequency (%)
9.86                  1        < 0.1%
7.395                 1        < 0.1%
6.6555                1        < 0.1%
6.409                 1        < 0.1%
5.916                 1        < 0.1%

amount_paid
Real number (ℝ≥0)

Distinct: 6471
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.18327131
Minimum: 0
Maximum: 1131.03
Zeros: 872
Zeros (%): 0.1%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 4.5135
Q1: 6.64812
median: 9.027
Q3: 12.213
95-th percentile: 19.5408
Maximum: 1131.03
Range: 1131.03
Interquartile range (IQR): 5.56488

Descriptive statistics

Standard deviation: 5.6181212
Coefficient of variation (CV): 0.5517010233
Kurtosis: 2243.912588
Mean: 10.18327131
Median Absolute Deviation (MAD): 2.655
Skewness: 15.5881411
Sum: 8010161.21
Variance: 31.56328582
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value                 Count    Frequency (%)
5.31                  14667    1.9%
7.965                 14410    1.8%
6.372                 11878    1.5%
8.496                 10350    1.3%
6.903                 9988     1.3%
5.841                 9734     1.2%
9.027                 9213     1.2%
7.434                 9156     1.2%
10.62                 8982     1.1%
9.558                 8377     1.1%
Other values (6461)   679845   86.4%

Smallest values

Value                 Count    Frequency (%)
0                     872      0.1%
0.00531               1        < 0.1%
0.01593               1        < 0.1%
0.02655               1        < 0.1%
0.03717               1        < 0.1%

Largest values

Value                 Count    Frequency (%)
1131.03               1        < 0.1%
581.7105              1        < 0.1%
363.01815             1        < 0.1%
353.3805              1        < 0.1%
246.88845             1        < 0.1%

restaurant_id
Real number (ℝ≥0)

Distinct: 13569
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 162864079.3
Minimum: 73498
Maximum: 340453498
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 73498
5-th percentile: 29803498
Q1: 86023498
median: 169613498
Q3: 228433498
95-th percentile: 302393498
Maximum: 340453498
Range: 340380000
Interquartile range (IQR): 142410000

Descriptive statistics

Standard deviation: 87830821.23
Coefficient of variation (CV): 0.5392890906
Kurtosis: -1.08595334
Mean: 162864079.3
Median Absolute Deviation (MAD): 71240000
Skewness: -0.02254910338
Sum: 1.281088848e+14
Variance: 7.714253157e+15
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value                 Count    Frequency (%)
37623498              1317     0.2%
983498                1071     0.1%
192673498             1031     0.1%
154543498             999      0.1%
88773498              967      0.1%
146723498             942      0.1%
105253498             935      0.1%
18603498              922      0.1%
30633498              918      0.1%
29593498              882      0.1%
Other values (13559)  776616   98.7%

Smallest values

Value                 Count    Frequency (%)
73498                 120      < 0.1%
123498                37       < 0.1%
153498                193      < 0.1%
173498                181      < 0.1%
193498                84       < 0.1%

Largest values

Value                 Count    Frequency (%)
340453498             1        < 0.1%
340093498             2        < 0.1%
340033498             1        < 0.1%
339983498             2        < 0.1%
339913498             1        < 0.1%

city_id
Real number (ℝ≥0)

Distinct: 3749
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47179.7505
Minimum: 230
Maximum: 100205
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 230
5-th percentile: 10346
Q1: 24799
median: 46467
Q3: 67886
95-th percentile: 89749
Maximum: 100205
Range: 99975
Interquartile range (IQR): 43087

Descriptive statistics

Standard deviation: 25904.63056
Coefficient of variation (CV): 0.5490624747
Kurtosis: -1.018564164
Mean: 47179.7505
Median Absolute Deviation (MAD): 21419
Skewness: 0.05185593619
Sum: 3.711159174e+10
Variance: 671049884.7
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values

Value                 Count    Frequency (%)
10346                 86654    11.0%
20326                 36210    4.6%
80562                 34100    4.3%
50898                 21627    2.7%
40441                 16732    2.1%
60537                 14760    1.9%
44366                 14119    1.8%
45358                 11246    1.4%
4334                  11106    1.4%
90633                 10449    1.3%
Other values (3739)   529597   67.3%

Smallest values

Value                 Count    Frequency (%)
230                   993      0.1%
1298                  6519     0.8%
1676                  77       < 0.1%
1685                  33       < 0.1%
1689                  18       < 0.1%

Largest values

Value                 Count    Frequency (%)
100205                1        < 0.1%
100079                1        < 0.1%
100061                3        < 0.1%
100048                56       < 0.1%
99999                 5        < 0.1%

payment_id
Real number (ℝ≥0)

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1668.509077
Minimum: 1491
Maximum: 1811
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 1491
5-th percentile: 1523
Q1: 1619
median: 1619
Q3: 1779
95-th percentile: 1779
Maximum: 1811
Range: 320
Interquartile range (IQR): 160

Descriptive statistics

Standard deviation: 87.19266546
Coefficient of variation (CV): 0.05225783105
Kurtosis: -1.011622604
Mean: 1668.509077
Median Absolute Deviation (MAD): 0
Skewness: 0.2658271582
Sum: 1312449240
Variance: 7602.56091
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=5)]

Most frequent values

Value                 Count    Frequency (%)
1619                  476600   60.6%
1779                  234133   29.8%
1491                  36497    4.6%
1811                  34492    4.4%
1523                  4878     0.6%

Smallest values

Value                 Count    Frequency (%)
1491                  36497    4.6%
1523                  4878     0.6%
1619                  476600   60.6%
1779                  234133   29.8%
1811                  34492    4.4%

Largest values

Value                 Count    Frequency (%)
1811                  34492    4.4%
1779                  234133   29.8%
1619                  476600   60.6%
1523                  4878     0.6%
1491                  36497    4.6%

platform_id
Real number (ℝ≥0)

SKEWED

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29868.52938
Minimum: 525
Maximum: 30423
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 525
5-th percentile: 29463
Q1: 29463
median: 29815
Q3: 30231
95-th percentile: 30359
Maximum: 30423
Range: 29898
Interquartile range (IQR): 768

Descriptive statistics

Standard deviation: 1160.893265
Coefficient of variation (CV): 0.03886677012
Kurtosis: 565.3036862
Mean: 29868.52938
Median Absolute Deviation (MAD): 352
Skewness: -22.53663783
Sum: 2.349458521e+10
Variance: 1347673.174
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=14)]

Most frequent values

Value                 Count    Frequency (%)
29463                 241523   30.7%
30231                 216726   27.6%
29815                 158972   20.2%
30359                 103653   13.2%
30391                 24434    3.1%
29751                 19321    2.5%
29495                 11151    1.4%
30423                 6819     0.9%
30199                 2079     0.3%
525                   1094     0.1%
Other values (4)      828      0.1%

Smallest values

Value                 Count    Frequency (%)
525                   1094     0.1%
22167                 3        < 0.1%
22263                 232      < 0.1%
22295                 1        < 0.1%
29463                 241523   30.7%

Largest values

Value                 Count    Frequency (%)
30423                 6819     0.9%
30391                 24434    3.1%
30359                 103653   13.2%
30231                 216726   27.6%
30199                 2079     0.3%

transmission_id
Real number (ℝ≥0)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4253.246112
Minimum: 212
Maximum: 21124
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 212
5-th percentile: 4228
Q1: 4228
median: 4324
Q3: 4356
95-th percentile: 4356
Maximum: 21124
Range: 20912
Interquartile range (IQR): 128

Descriptive statistics

Standard deviation: 572.8556657
Coefficient of variation (CV): 0.1346866959
Kurtosis: 176.6261099
Mean: 4253.246112
Median Absolute Deviation (MAD): 32
Skewness: -0.9114324558
Sum: 3345603392
Variance: 328163.6137
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=10)]

Most frequent values

Value                 Count    Frequency (%)
4356                  341734   43.4%
4324                  203668   25.9%
4228                  201617   25.6%
4260                  14538    1.8%
212                   12676    1.6%
4996                  6737     0.9%
4196                  5276     0.7%
1988                  207      < 0.1%
21124                 146      < 0.1%
2020                  1        < 0.1%

Smallest values

Value                 Count    Frequency (%)
212                   12676    1.6%
1988                  207      < 0.1%
2020                  1        < 0.1%
4196                  5276     0.7%
4228                  201617   25.6%

Largest values

Value                 Count    Frequency (%)
21124                 146      < 0.1%
4996                  6737     0.9%
4356                  341734   43.4%
4324                  203668   25.9%
4260                  14538    1.8%
 
is_returning_customer
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 MiB

Value                 Count    Frequency (%)
1                     408889   52.0%
0                     377711   48.0%

Interactions

[Figures: 100 interaction plots, not preserved in this text version]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a perfectly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
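
As an illustration of this definition, here is a minimal pure-Python sketch. The function name `pearson_r` and the toy data are illustrative and not taken from the report. The 1/n factors of the covariance and the standard deviations cancel, so plain sums are used.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products; the common 1/n factor cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
# Shifting and rescaling one variable (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r(xs, [3 * y + 7 for y in ys])
# Flipping the sign of one variable flips the sign of r.
r_flipped = pearson_r(xs, [-y for y in ys])

print(r, r_rescaled, r_flipped)
```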

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
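
A minimal pure-Python sketch of this calculation follows. The helper names are illustrative, and ties are assumed to receive the average of the ranks they span, which is a common convention but is not stated in the report.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but not linear

print("Spearman:", spearman_rho(xs, ys))
print("Pearson: ", pearson_r(xs, ys))
```

On this toy data the relationship is perfectly monotonic but not linear, so ρ reaches its maximum while Pearson's r stays below 1.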

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
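
The pair-counting definition can be sketched directly in pure Python. The function name `kendall_tau_a` is illustrative. This is the simple variant that divides by the total number of pairs, exactly as described above; statistical libraries often use a tie-adjusted variant instead, so their results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # A product of zero is a tie and counts as neither.
    return (concordant - discordant) / len(pairs)


xs = [1, 2, 3, 4]
print(kendall_tau_a(xs, [1, 2, 3, 4]))  # identical ordering
print(kendall_tau_a(xs, [4, 3, 2, 1]))  # reversed ordering
print(kendall_tau_a(xs, [1, 3, 2, 4]))  # one swapped pair out of six
```

This brute-force version inspects every pair, so its cost grows quadratically with the number of observations; it is meant to clarify the definition rather than to scale to a dataset of this size.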

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

[Figures: 3 missing-value plots, not preserved in this text version]

Sample

First rows

(index)  customer_id   order_date  order_hour  customer_order_rank  is_failed  voucher_amount  delivery_fee  amount_paid  restaurant_id  city_id  payment_id  platform_id  transmission_id  is_returning_customer
0        000097eabfd9  2015-06-20  19          1.0                  0          0.0             0.000         11.46960     5803498        20326    1779        30231        4356             0
1        0000e2c6d9be  2016-01-29  20          1.0                  0          0.0             0.000         9.55800      239303498      76547    1619        30359        4356             0
2        000133bb597f  2017-02-26  19          1.0                  0          0.0             0.493         5.93658      206463498      33833    1619        30359        4324             1
3        00018269939b  2017-02-05  17          1.0                  0          0.0             0.493         9.82350      36613498       99315    1619        30359        4356             0
4        0001a00468a6  2015-08-04  19          1.0                  0          0.0             0.493         5.15070      225853498      16456    1619        29463        4356             0
5        0001d9036b5e  2015-08-29  19          1.0                  0          0.0             0.000         11.94750     193643498      88276    1619        29463        4356             0
6        0001d9036b5e  2017-01-04  17          2.0                  0          0.0             0.000         11.15100     193643498      88276    1619        29463        4356             0
7        0001d9036b5e  2017-01-28  16          3.0                  0          0.0             0.000         9.71730      193643498      88276    1619        30359        4356             0
8        0001e1e04d7d  2015-10-24  19          1.0                  0          0.0             0.000         25.22250     144833498      45358    1619        29463        4356             1
9        0001e1e04d7d  2016-03-24  19          2.0                  0          0.0             0.000         9.29250      95953498       45358    1619        29463        4324             1

Last rows

(index)  customer_id   order_date  order_hour  customer_order_rank  is_failed  voucher_amount  delivery_fee  amount_paid  restaurant_id  city_id  payment_id  platform_id  transmission_id  is_returning_customer
786590   fffcf45e5c69  2016-11-19  12          1.0                  0          0.0             0.0000        12.53160     107463498      39335    1619        29463        4356             0
786591   fffcf45e5c69  2017-02-04  12          2.0                  0          0.0             0.0000        11.57580     107463498      39335    1619        30359        4356             0
786592   fffd696eaedd  2015-09-14  12          1.0                  0          0.0             1.4297        24.13395     95323498       80562    1779        29463        4356             0
786593   fffe9d5a8d41  2016-07-31  21          NaN                  1          0.0             0.0000        8.44290      156133498      10346    1811        29463        212              1
786594   fffe9d5a8d41  2016-09-30  20          1.0                  0          0.0             0.0000        10.72620     983498         10346    1779        29463        4228             1
786595   fffe9d5a8d41  2016-09-30  20          NaN                  1          0.0             0.0000        10.72620     983498         10346    1779        29463        212              1
786596   ffff347c3cfa  2016-08-17  21          1.0                  0          0.0             0.0000        7.59330      52893498       41978    1619        30359        4356             1
786597   ffff347c3cfa  2016-09-15  21          2.0                  0          0.0             0.0000        5.94720      164653498      41978    1619        30359        4356             1
786598   ffff4519b52d  2016-04-02  19          1.0                  0          0.0             0.0000        21.77100     16363498       80562    1491        29751        4228             0
786599   ffffccbfc8a4  2015-05-30  20          1.0                  0          0.0             0.0000        16.46100     150293498      45952    1619        29463        4324             0